Abstract

In this paper, we specify a class of mathematical problems, which we refer to as "Function Density
Problems" (FDPs, in short), and point out novel connections of FDPs to the following two cryptographic
topics: theoretical security evaluations of keyless hash functions (such as SHA-1), and constructions of
provably secure pseudorandom generators (PRGs) with some enhanced security property introduced by 
Dubrov and Ishai [STOC 2006]. Our argument aims at proposing new theoretical frameworks for these 
topics (especially for the former) based on FDPs, rather than providing some concrete and practical 
results on the topics. We also give some examples of mathematical discussions on FDPs, which would be 
of independent interest from mathematical viewpoints. Finally, we discuss possible directions of future 
research on other cryptographic applications of FDPs and on mathematical studies on FDPs themselves. 
1 Introduction

1.1 Background and related works 

It is widely understood that some mathematical problems have been playing indispensable roles in research 
on cryptography and information security. For instance, the (expected) computational difficulty of integer 
factorization is the source of security of the RSA cryptosystem [8], while the problem of solving multivariate
quadratic (MQ) equations has attracted several studies after the development of the Matsumoto-Imai
cryptosystem [2] and its variants, whose constructions are closely related to MQ equations. Hence, posing and
studying an interesting mathematical problem which arises in certain cryptographic settings can contribute 
to the progress of cryptography and information security. 

The aim of this paper is to emphasize the significance of a certain mathematical problem, which has 
connections to the following two major topics in information security: security analysis of keyless hash
functions in the real world (such as MD5 and SHA-1), and construction of pseudorandom generators (PRGs) 
with some enhanced security property. First, we give some descriptions of these two topics. 



Security analysis of keyless hash functions. Intuitively, a hash function is a function $H : X \to Y$
from some (finite) set $X$ to another (finite) set $Y$ that possesses a certain desirable security property. When
we are concerned with the efficiency or computability of $H$, we consider an algorithm that computes $H$ (also
denoted by $H$) and call it a hash algorithm. One of the standard security requirements for hash functions is
collision resistance, which informally means that it is difficult to find a collision pair $(x_1, x_2)$ for $H$,
i.e., $x_1 \neq x_2 \in X$ satisfying $H(x_1) = H(x_2)$. Hash functions have been playing central roles in various
information security applications, and secure hash functions for real-life applications are usually expected to
possess the collision resistance property.
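
As a small illustration of the notion of a collision pair, the following sketch enumerates all collision pairs of a function over a tiny domain by brute force. The function `toy_hash` and the 16-element domain are hypothetical choices made only for this example; they are not taken from the paper and have no security relevance.

```python
from itertools import combinations

def collision_pairs(H, X):
    """Return all unordered pairs (x1, x2) with x1 != x2 and H(x1) == H(x2)."""
    return [(x1, x2) for x1, x2 in combinations(X, 2) if H(x1) == H(x2)]

# Hypothetical toy "hash": 4-bit inputs reduced modulo 5 (not a real hash function).
X = range(16)
toy_hash = lambda x: x % 5

pairs = collision_pairs(toy_hash, X)
print(len(pairs), pairs[:3])
```

Because the domain (16 elements) is larger than the range (5 elements), collision pairs necessarily exist; the difficulty discussed in the text concerns finding them efficiently when $X$ is huge.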

However, most of the preceding successful studies that show security of hash functions actually dealt
with keyed hash functions (or hash families); intuitively, a family of hash functions parameterized by
a key $k$ is called collision resistant if, for any (efficient) adversary, the attack to find a collision pair
of $H_k$ fails with high probability for a randomly chosen key $k$. Several constructions of keyed hash functions have
been proposed so far (e.g., [5]). The above security notion of keyed hash functions can be interpreted as 
allowing one to (randomly) choose a concrete instance of the hash family after an adversary is given. 
In contrast, in most real-life applications, the concrete instance of a hash algorithm is specified first (for
example, by standardization), and then an adversary can try to attack the fixed hash algorithm. This
reversal of order causes a crucial difficulty in guaranteeing (or even formalizing in a reasonable manner)
security of a keyless hash algorithm $H$: except in the trivial situation where the domain of $H$ is not larger
than the image of $H$, there always exists a collision pair $(x_1, x_2)$ for $H$, and any adversary (existing in
theory) who innately knows the pair $(x_1, x_2)$ is obviously able to attack the fixed hash algorithm $H$
efficiently. In fact, even standardized (or de facto standard) hash algorithms, whose security must have been
evaluated well before standardization, have suffered from feasible attacks (e.g., [10]). In this paper,
we try to propose a theoretical and unified way to say something, preferably affirmative, about security of a
concrete (keyless) instance of hash algorithms.

For related works, Rogaway [5] gave a detailed observation about the difference between "inexistence of 
effective attack algorithms" and "lack of knowledge on construction of effective attack algorithms" for keyless 
hash algorithms. He emphasized the difference of the two situations (by the term "human ignorance"), and 
discussed how to prove security of a cryptographic protocol by reducing the security into "lack of knowledge 
on concrete attacks" on the hash algorithm internally used by the protocol. However, he did not discuss 
how to theoretically evaluate security of keyless hash algorithms themselves, which we study in this paper. 
On the other hand, in this paper we adopt a concrete security formulation rather than an asymptotic one, whereas
some observations on security of keyless hash algorithms in the asymptotic security formulation are also given in
Rogaway's paper.

Construction of enhanced PRGs. A PRG is an algorithm $G : S \to X$ with (finite) set $S$ of inputs
(seeds) and (finite) output set $X$ with the property that, when a seed $s \in S$ is chosen uniformly at random,
the output $G(s) \in X$ of $G$ is also "random" in some sense. Conventionally, the meaning of "randomness"
here is formulated by using the notion of a distinguisher, which is an algorithm $D : X \to \{0,1\}$ with 1-bit
output whose input set is the output set $X$ of $G$. In this paper we adopt a concrete security formulation
rather than an asymptotic one, in which case the security requirement for PRGs can be formulated as
$(T, \varepsilon)$-security; namely, $G$ is called $(T, \varepsilon)$-secure [4] if, for any distinguisher $D$ for $G$
with (time) complexity bounded by $T$, the statistical distance between the output distribution $D(G(U_S))$ of $D$
with input given by $G$ with a uniformly random seed $s \in S$ (referred to as "pseudorandom input") and the output
distribution $D(U_X)$ of $D$ with uniformly random input $x \in X$ (referred to as "random input") is bounded by
$\varepsilon$. (Intuitively, any such $D$ cannot distinguish the random element $x$ and the pseudorandom element
$G(s)$ in $X$ with significant advantage.) There are a large number of constructions of PRGs, most of which are
provably secure (possibly in asymptotic security formulation) under standard computational assumptions (e.g., [HE]).
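
To make the notion of distinguishing concrete, the sketch below computes exactly, for a toy example, the gap between the probabilities that a 1-bit distinguisher outputs 1 on random and on pseudorandom inputs. The generator `toy_G` (a 2-bit seed repeated twice) and the distinguisher `D_halves` are hypothetical and deliberately insecure; they only illustrate the definition.

```python
from fractions import Fraction

def advantage(D, G, seeds, outputs):
    """Exact |Pr[D(U_X) = 1] - Pr[D(G(U_S)) = 1]| for finite S and X."""
    p_random = Fraction(sum(D(x) for x in outputs), len(outputs))
    p_pseudo = Fraction(sum(D(G(s)) for s in seeds), len(seeds))
    return abs(p_random - p_pseudo)

# Hypothetical toy generator: stretch a 2-bit seed to 4 bits by repeating it.
seeds, outputs = range(4), range(16)
toy_G = lambda s: (s << 2) | s

# Hypothetical distinguisher: output 1 iff the two 2-bit halves are equal.
D_halves = lambda x: int((x >> 2) == (x & 3))

adv = advantage(D_halves, toy_G, seeds, outputs)
print(adv)
```

Here the distinguisher always outputs 1 on pseudorandom inputs but only on 4 of the 16 random inputs, so this toy generator cannot be $(T, \varepsilon)$-secure for small $\varepsilon$ once $T$ covers the cost of `D_halves`.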

On the other hand, in a preceding work of Dubrov and Ishai [3], an enhanced notion for PRGs, called
pseudorandom generators that fool non-boolean distinguishers (nb-PRGs, in short), was proposed. This
notion is obtained by allowing the distinguishers $D$ in the above security notion to have larger output sets;
namely, $G$ is called $(T, n, \varepsilon)$-secure if, for any "non-boolean" distinguisher $D : X \to Y$ for $G$ with
(time) complexity bounded by $T$ and output set $Y$ of size at most $n$, the statistical distance between the output
distributions of $D$ with random and pseudorandom inputs is bounded by $\varepsilon$. Dubrov and Ishai showed
interesting applications of nb-PRGs, e.g., secure pseudorandomization of a certain kind of
information-theoretically secure protocols without any restriction on computational complexity of the adversary's
attack algorithm.

However, constructing secure nb-PRGs seems much more difficult than the case of the usual PRGs. 
Indeed, to the authors' best knowledge, the only constructions of nb-PRGs proposed so far are ones in the 
original paper [3], which are based on certain less standard computational assumptions. Hence it will be 
fruitful if we can give some results implying that any usual PRG (with some parameter) is also an nb-PRG 
(with a possibly different parameter). In fact, a straightforward implication has been mentioned in [3], but 
this is far from being efficient (i.e., to obtain nb-PRGs with reasonable security parameters, the original
PRGs are required to have somewhat impractical security parameters). In this paper, we try to establish a 
more efficient implication result. 

1.2 Our contributions, and organization of this paper 

In Section 2 we propose a class of mathematical problems, which we refer to as "Function Density Problems".
Intuitively, this problem is to evaluate the possibility of close approximations of arbitrary functions by using 
some "easily describable (or analyzable)" functions. 

Then we introduce motivating applications of Function Density Problems to two topics in information 
security. First, in Section 3 we discuss theoretical analysis of collision resistance of keyless hash algorithms.
We give an abstract framework for attacking a given hash algorithm by using known attacks on some other 
"easily breakable" hash algorithms. In the framework, it is essential to evaluate how closely a target hash 
algorithm can be approximated by "easily breakable" hash algorithms; thus Function Density Problems play 
a significant role in the security evaluation of hash algorithms. 

Secondly, in Section 4 we study an enhanced security notion for PRGs (called nb-PRG) introduced by
Dubrov and Ishai [3]. We give some implication results showing that any secure PRG with some parameter
is also a secure nb-PRG with somewhat modified security parameter. In the results, the overheads in the 
bounds of (time) complexity and of advantages for the distinguishers are in trade-off relations, and Function 
Density Problems can be applied to evaluate to what extent the trade-off will be improved by our proposed 
result. 

Then, to give some intuition of how Function Density Problems can be studied mathematically,
in Section 5 we give some concrete examples of mathematical discussions on Function Density
Problems themselves, using combinatorial and geometric arguments and techniques in Gröbner bases. In
particular, we deal with special cases where the set of "easily describable (or analyzable)" functions forms 
a linear subspace (related to low-degree boolean functions, perfect linear codes and Reed-Solomon codes), 
which would be of independent interest from mathematical viewpoints. 

Finally, in Section 6 we give a concluding remark, which includes discussions on further possible
applications of Function Density Problems in information security, and on possible directions of future research
on Function Density Problems themselves.

2 Function Density Problems 

In this section, we specify a class of mathematical problems, which we call Function Density Problems 
(FDPs) in this paper. As the class of FDPs in its most general form includes problems too diverse to
yield meaningful insights into their properties, it is important to restrict the class suitably according to each
situation under consideration. Relations of FDPs to some concrete topics in cryptography will be shown in 
the following sections. 

We give a general description of our problem: 

Definition 1 (Function Density Problems). Let $C$ be a set of functions, and let $C'$ be a subset of $C$.
Let $d(\cdot,\cdot)$ be a distance function for pairs of functions in $C$. In this setting, we define a Function
Density Problem to be the problem of estimating the following quantity:

$$ r(C, C') := \sup\{\, d(f, C') \mid f \in C \,\} , \tag{1} $$

where, for each $f \in C$, $d(f, C') := \inf\{\, d(f,g) \mid g \in C' \,\}$ is the distance from $f$ to $C'$. (The
symbol 'r' stands for "radius", by analogy with the case where $C'$ is a single central point of the figure $C$, in
which case $r$ is the radius of $C$ in the usual sense.)

Among the wide variety of situations covered by Definition 1 (where $C$ in fact need not even be a set of
functions!), in the applications of FDPs discussed in this paper we will focus on the following typical cases:

Definition 2 (Function Density Problems - typical cases). Let $C$ be the set of all functions $f : X \to Y$ from
a given finite set $X$ to a given finite set $Y$. Let $C' \subseteq C$. For any $f, g \in C$, we define the distance
between $f$ and $g$ by

$$ d_H(f,g) := |\{\, x \in X \mid f(x) \neq g(x) \,\}| . \tag{2} $$

In this setting, a Function Density Problem is the problem of estimating the quantity $r(C, C')$ defined by (1)
with $d(\cdot,\cdot) = d_H(\cdot,\cdot)$.

In the case of Definition 2, the "sup" and "inf" in Definition 1 can be simply replaced with "max"
and "min", respectively. Moreover, the distance defined by (2) coincides with the (generalized) Hamming
distance when members of $C$ are identified with sequences of length $|X|$ over the alphabet $Y$ in a natural
manner. Note that the quantity $r(C, C')$ can be regarded as a special case of the so-called Hausdorff distance
between two subsets of a metric space, which supports the view that $r(C, C')$ is a reasonable quantity to consider.
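
For very small parameters, the quantity $r(C, C')$ of Definition 2 can be computed by exhaustive search. The following sketch does so, representing each function $X \to Y$ as a tuple of values; the helper names and the choice of $C'$ (the constant functions) are hypothetical and serve only to illustrate the definition.

```python
from itertools import product

def hamming(f, g):
    """d_H(f, g): number of points where the value tables f and g differ."""
    return sum(1 for a, b in zip(f, g) if a != b)

def density_radius(all_functions, easy_functions):
    """r(C, C') = max over f in C of min over g in C' of d_H(f, g)."""
    return max(min(hamming(f, g) for g in easy_functions) for f in all_functions)

# All functions X -> Y as value tables, with |X| = 4 and Y = {0, 1}.
size_X, Y = 4, (0, 1)
C = list(product(Y, repeat=size_X))

# Hypothetical choice of "easy" functions C': the constant functions.
C_easy = [tuple([y] * size_X) for y in Y]

r = density_radius(C, C_easy)
print(r)
```

The search visits every pair in $C \times C'$, so it is feasible only for toy sizes; for realistic classes one would look for bounds on $r(C, C')$ instead.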

An intuitive explanation of the motivation for the above definition is as follows. Given a set $C$ of functions,
the subset $C'$ consists of members of $C$ which are in some sense "easily analyzable" or "with simple
descriptions". The distance $d(f,g)$ measures how similar two functions $f$ and $g$ are. Then the quantity
$d(f, C')$ evaluates how accurately a function $f \in C$ can be approximated by an "easy" function in $C'$, and the
quantity $r(C, C')$ evaluates how densely the "easy" functions are distributed among the entire set $C$. In other
words, when $r(C, C')$ turns out to be small, it shows the potential availability of a close approximation of any
member of $C$ by an "easy" function in $C'$. For example, in the case of Definition 2, any function $f \in C$ can in
principle be converted into some function $g \in C'$ by changing the values $f(x)$ for at most $r(C, C')$ points
$x \in X$. (We emphasize that this does not mean that a close approximation of $f$ by a function in $C'$ is
efficiently computable. Such a difference between existence and efficient computability is also relevant to a
preceding observation on "human ignorance" by Rogaway [5].)
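
The conversion of $f$ into a nearby "easy" function can be illustrated on a toy case. In the sketch below, $X = \{0,1\}^3$, $Y = \{0,1\}$, and $C'$ is taken to be the affine boolean functions; this choice of $C'$ and the names `nearest_easy` and `affine` are assumptions made for the example only. The search is exhaustive, which is exactly what is not available in general.

```python
from itertools import product

def nearest_easy(f, easy_functions):
    """Return (g, points): a closest g in C' to f and the indices where f and g differ."""
    g = min(easy_functions, key=lambda h: sum(a != b for a, b in zip(f, h)))
    return g, [i for i, (a, b) in enumerate(zip(f, g)) if a != b]

m = 3
points = list(product((0, 1), repeat=m))  # X = {0,1}^3

# Hypothetical C': affine functions x -> c0 + c1*x1 + c2*x2 + c3*x3 (mod 2), as value tables.
affine = [
    tuple((c0 + sum(c * x for c, x in zip(cs, p))) % 2 for p in points)
    for c0 in (0, 1)
    for cs in product((0, 1), repeat=m)
]

# Example target: f(x1, x2, x3) = x1*x2 + x3 (mod 2).
f = tuple((p[0] & p[1]) ^ p[2] for p in points)
g, changed = nearest_easy(f, affine)
print(g, changed)
```

The list `changed` gives the points of $X$ whose values must be altered to turn $f$ into the affine function $g$; its length is $d_H(f, C')$.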

3 Hash functions and FDPs 

In this section, we point out a relation of FDPs introduced in Section [2] to security analysis of keyless hash 
functions. Here we propose a new framework for theoretical security evaluation of keyless hash functions 
based on FDPs. Although theoretical security evaluation of keyless hash functions is evidently an extremely 
difficult problem and our proposed framework is unfortunately not yet practical, we hope that our framework 
can be a clue to this problem. 

We consider a keyless hash function $H : X \to Y$ with possibly large but finite domain $X$ and relatively
small (finite) range $Y$. Among the major security requirements for hash functions, we focus on the collision
resistance of $H$; we discuss how difficult it is to find a collision pair $(x_1, x_2)$ for $H$ (recall that
$(x_1, x_2)$ is called a collision pair for $H$ if we have $x_1, x_2 \in X$, $x_1 \neq x_2$ and $H(x_1) = H(x_2)$). To
show the relevance of FDPs to this problem, first we give a somewhat informal description of an abstract "typical"
strategy for finding a collision pair:

1. Construct a close approximation $H' : X \to Y$ of $H$ in such a way that collision pairs for $H'$ can be
found with reasonable computational time.

2. Find randomly a collision pair $(x'_1, x'_2)$ for $H'$.

3. Construct from $(x'_1, x'_2)$ a candidate $(x_1, x_2)$ of a collision pair for $H$ (in the simplest case, we
just set $(x_1, x_2) = (x'_1, x'_2)$).

4. Check if $(x_1, x_2)$ is a collision pair of $H$; if it is indeed a collision pair of $H$, then output
$(x_1, x_2)$ and stop the process.

5. If $(x_1, x_2)$ is not a collision pair of $H$, go back to Step 2 and repeat the process.
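
The steps above can be sketched as a short loop. Everything concrete in this sketch is a hypothetical toy: the target `H_target` agrees with the easily attacked `H_easy` (reduction modulo 8) except at three points, and `sample_easy_collision` plays the role of the known attack on $H'$. Step 1 is assumed to have been carried out already, and Step 3 uses the simplest choice of keeping the pair unchanged.

```python
import random

SIZE_X, N = 64, 8

def H_easy(x):
    """Hypothetical approximation H': collisions are trivial to enumerate."""
    return x % N

EXCEPTIONS = {3: 0, 17: 5, 42: 7}  # points where the toy target differs from H'

def H_target(x):
    """Hypothetical target H with d_H(H, H') = 3."""
    return EXCEPTIONS.get(x, x % N)

def sample_easy_collision(rng):
    """Step 2: a uniformly random (ordered) collision pair of H_easy."""
    x1 = rng.randrange(SIZE_X)
    x2 = rng.choice([x for x in range(x1 % N, SIZE_X, N) if x != x1])
    return x1, x2

def find_collision_via_approximation(H, sample_collision, max_tries=1000, rng=None):
    """Steps 2-5: resample collision pairs of H' until one also collides under H."""
    if rng is None:
        rng = random.Random(0)
    for tries in range(1, max_tries + 1):
        x1, x2 = sample_collision(rng)        # Step 2
        # Step 3 (simplest case): the candidate is the sampled pair itself.
        if x1 != x2 and H(x1) == H(x2):       # Step 4
            return (x1, x2), tries
    return None, max_tries                    # gave up after repeating Step 5

pair, tries = find_collision_via_approximation(H_target, sample_easy_collision)
print(pair, tries)
```

Since the two toy functions differ at only 3 of 64 points, most sampled pairs survive the check in Step 4, which is the behaviour the strategy relies on.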

Intuitively, the number of iterations in the above strategy before finding a collision pair for $H$ would be
expected to be small if the approximation $H'$ is sufficiently close to $H$ (see Lemma 1 below for a quantitative
expression of this expected tendency). Hence security of a hash algorithm $H$ against such an attack strategy
is related to the possibility of finding its close approximation.

More precisely, we set $(x_1, x_2) = (x'_1, x'_2)$ in the above strategy for simplicity. We consider the case
of Definition 2, and let $C'$ be a subset of $C$ with the property that any hash function $H'$ in $C'$ admits an
efficient attack (finding a collision pair) by a certain known attack strategy. In the above attack strategy, the
approximation $H'$ for $H$ specified in Step 1 is supposed to be chosen from $C'$. Now we have the following
lemma:

Lemma 1. Suppose that $H$ and $H'$ are functions $X \to Y$ with $|Y| = n \geq 2$, and $d_H(H, H') = d$,
$0 < d < |X|$. Then the probability that a collision pair for $H'$, which is chosen uniformly at random from the
set of all collision pairs for $H'$, is also a collision pair for $H$ is not lower than

$$ \frac{2a_0|X| - n(a_0+1)a_0 - 2da_0}{2a_0|X| + 2d|X| - n(a_0+1)a_0 - 2da_0 - d^2 - d} , \tag{3} $$

where $a_0 = \lfloor (|X| - d - 1)/n \rfloor$. Moreover, when $|X| \geq d + (n-1)^2$, the value in (3) gets larger
as $d$ becomes smaller.
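
The bound of Lemma 1 is easy to evaluate, and for tiny parameters it can be compared with the exact probability by exhaustive enumeration. The sketch below does both, under the reading of the bound with $a_0 = \lfloor (|X| - d - 1)/n \rfloor$ and $0 < d < |X|$; functions are represented as value tables, collision pairs are counted as ordered pairs, and all helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

def lemma1_bound(size_X, n, d):
    """Lower bound of Lemma 1, with a0 = floor((|X| - d - 1) / n)."""
    a0 = (size_X - d - 1) // n
    numerator = 2 * a0 * size_X - n * (a0 + 1) * a0 - 2 * d * a0
    denominator = numerator + 2 * d * size_X - d * d - d
    return Fraction(numerator, denominator)

def shared_collision_probability(H, H_approx):
    """Exact probability that a uniformly random collision pair of H' also collides under H."""
    idx = range(len(H))
    pairs = [(i, j) for i, j in permutations(idx, 2) if H_approx[i] == H_approx[j]]
    return Fraction(sum(H[i] == H[j] for i, j in pairs), len(pairs))

def bound_holds_everywhere(size_X, n):
    """Check the bound for every pair (H, H') of value tables; requires size_X > n."""
    tables = list(product(range(n), repeat=size_X))
    for H_approx in tables:
        for H in tables:
            d = sum(x != y for x, y in zip(H, H_approx))
            if 0 < d < size_X:
                if shared_collision_probability(H, H_approx) < lemma1_bound(size_X, n, d):
                    return False
    return True

print(lemma1_bound(6, 2, 1))
print(bound_holds_everywhere(5, 2))
```

The exhaustive check only covers toy sizes, so it is a sanity check of the formula rather than evidence about realistic hash functions.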

A proof of Lemma 1 will be provided at the end of this section. Now let us imagine the following situation.
Two candidate sets $C_1$, $C_2$ for a new standard hash function are given, and we can specify subsets
$C'_1 \subseteq C_1$ and $C'_2 \subseteq C_2$ in such a way that each $C'_i$ ($i = 1, 2$) consists of some hash
functions for which collision pairs can be found in reasonable computational time by using some known techniques.
We suppose that $r(C_1, C'_1)$ is significantly small and $r(C_2, C'_2)$ is significantly large. Then any hash
function $H$ chosen from $C_1$ can be potentially attacked by just finding a close approximation $H' \in C'_1$ of
$H$ (using some expert's sixth sense, for example) and applying the above attack strategy combined with known
collision finding techniques. On the other hand, $C_2$ contains at least one hash function $H$ for which the above
attack strategy combined with any known collision finding technique will not succeed. This would suggest that it
can be potentially safer to choose a new hash function from $C_2$ rather than $C_1$, as we already know the
potential attack on any hash function in $C_1$ but not the same for $C_2$.

The authors hope that studies of FDPs can contribute to security analysis of keyless hash functions in
the above manner, though how to specify the subset $C'$ in practical cases is of course a major issue to be
addressed. One may also feel that it seems infeasible to compute the quantity $r(C, C')$ for practical classes
of hash functions; even so, some estimate of a bound or tendency of $r(C, C')$ would still give us an insight
into the security level of those hash functions.

Remark 1. Here we notice that, although we have focused on the collision resistance in the above argument, 
a similar idea would also be applicable to other security notions for keyless hash functions, such as the 
(second) preimage resistance. 

To conclude this section, we give a proof of Lemma 1.

Proof of Lemma 1. We write $(m)_2 := m(m-1)$ for any integer $m$. Put $Y := \{y_1, \dots, y_n\}$, and for each
$1 \leq i \leq n$, put

$$ a_i := |\{\, x \in X \mid H'(x) = y_i \,\}| , \quad b_i := |\{\, x \in X \mid H(x) \neq H'(x) = y_i \,\}| . \tag{4} $$

Moreover, put

$$ \varphi_1(a;b) := \sum_{i=1}^{n} (a_i)_2 , \quad \varphi_2(a;b) := \sum_{i=1}^{n} (a_i - b_i)_2 , \tag{5} $$

where $a := (a_1, \dots, a_n)$ and $b := (b_1, \dots, b_n)$. Then the number of collision pairs for $H'$ is
$\varphi_1(a;b)$, while the number of these that are also collision pairs for $H$ is at least $\varphi_2(a;b)$.
Therefore the probability specified in the statement of Lemma 1 is at least

$$ \varphi(a;b) := \frac{\varphi_2(a;b)}{\varphi_1(a;b)} . \tag{6} $$

From now, we give a lower bound for the values of $\varphi$ under the following conditions implied by the
definitions: $0 \leq b_i \leq a_i$ for each $i$, $\sum_{i=1}^{n} a_i = |X|$, and $\sum_{i=1}^{n} b_i = d$. For the
purpose, we show the following two lemmas:
Lemma 2. In the above setting, if the minimum value of the function $\varphi$ is attained by $a$ and $b$, then we
have $b_i > 0$ for a unique index $i$, and $a_i - b_i \geq a_j$ for every index $j \neq i$.

Proof. If we have $i \neq j$ and $b_i, b_j > 0$, and we suppose $a_i \leq a_j$ by symmetry, then we have

$$ \big( (a_i - 1)_2 + (a_j + 1)_2 \big) - \big( (a_i)_2 + (a_j)_2 \big) = 2(a_j - a_i + 1) > 0 , \tag{7} $$

therefore the value of $\varphi_1$ increases when $a_i$, $a_j$, $b_i$ and $b_j$ are replaced with $a_i - 1$,
$a_j + 1$, $b_i - 1$ and $b_j + 1$, respectively. On the other hand, the value of $\varphi_2$ is not changed by
this replacement. Therefore the value of $\varphi$ is decreased by this replacement, contradicting the assumption
on the choice of $a$ and $b$. Hence an index $i$ with $b_i > 0$ is unique, therefore $b_i = d$. Similarly, if
$j \neq i$ and $a_i - b_i < a_j$, then we have

$$ \big( (a_i - b_i + 1)_2 + (a_j - 1)_2 \big) - \big( (a_i - b_i)_2 + (a_j)_2 \big) = 2(a_i - b_i - a_j + 1) \leq 0 , \tag{8} $$

with equality holding when and only when $a_i - b_i = a_j - 1$. This implies that the value of $\varphi$ at the $a$
and $b$ is larger than or equal to the value of $\varphi$ with $b_i$ and $b_j$ ($= 0$) being replaced with
$b_i - 1$ and $1$, respectively, where the equality holds if and only if $a_i - b_i = a_j - 1$. As the former value
is assumed to be the minimum, the equality condition $a_i - b_i = a_j - 1$ should hold. Moreover, if
$b_i - 1 > 0$, then the latter value of $\varphi$ (which is now equal to the former) cannot be the minimum by the
above argument, which also leads to a contradiction. Hence we have $b_i = 1$ (therefore $d = 1$) and $a_i = a_j$.
Now we have

$$ \big( (a_i + 1)_2 + (a_j - 1)_2 \big) - \big( (a_i)_2 + (a_j)_2 \big) = 2(a_i - a_j + 1) > 0 . \tag{9} $$

This implies that the value of $\varphi$ will decrease when $a_i$ and $a_j$ are replaced with $a_i + 1$ and
$a_j - 1$, respectively, contradicting the assumption that the former value is the minimum. Hence we have
$a_i - b_i \geq a_j$ for every $j \neq i$, concluding the proof of Lemma 2. □

Lemma 3. In the above setting, if the minimum of the function $\varphi$ is attained by $a$ and $b$, then we have
$|a_i - a_j| \leq 1$ for any pair of indices $i \neq j$ satisfying $b_i = b_j = 0$.

Proof. Assume to the contrary that $a_i - a_j \geq 2$ for such a pair of indices $i \neq j$. For
$\ell \in \{1, 2\}$, let $\alpha_\ell$ denote the value of $\varphi_\ell$ at the $a$ and $b$, and let $\beta_\ell$
denote the value of $\varphi_\ell$ with $a_i$ and $a_j$ being replaced with $a_i - 1$ and $a_j + 1$, respectively.
Then we have $\beta_1 - \alpha_1 = \beta_2 - \alpha_2 = 2(a_j - a_i + 1) < 0$. On the other hand, for the
unique index $i'$ with $b_{i'} > 0$ (see Lemma 2), we have $a_{i'} \geq b_{i'} + a_i \geq b_{i'} + a_j + 2 > 2$ by
the assumption and Lemma 2, therefore $\alpha_1 > \alpha_2$. Now we present the following lemma, which is proven by
an easy calculation:

Lemma 4. If $p > q \geq 0$ and $r > 0$, then $q/p < (q + r)/(p + r)$.
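
The easy calculation behind Lemma 4 can be spelled out as follows. This is a sketch under the assumptions $p > q$, $p > 0$ and $r > 0$, which is what the strict inequality requires.

```latex
\frac{q+r}{p+r} - \frac{q}{p}
  = \frac{p(q+r) - q(p+r)}{p(p+r)}
  = \frac{r(p-q)}{p(p+r)} > 0 .
```

When $r = 0$ or $p = q$ the difference is zero, so the non-strict inequality $q/p \leq (q+r)/(p+r)$ still holds in those boundary cases.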

By using this lemma, we have

$$ \frac{\alpha_2}{\alpha_1} = \frac{\beta_2 - 2(a_j - a_i + 1)}{\beta_1 - 2(a_j - a_i + 1)} > \frac{\beta_2}{\beta_1} , \tag{10} $$

contradicting the assumption that $\alpha_2/\alpha_1$ is the minimum of the value of $\varphi$. Hence Lemma 3 holds. □

By Lemma 2 and Lemma 3, the points $a$ and $b$ that attain the minimum of $\varphi$ satisfy the following
conditions: $b_i > 0$ for a unique $i$, and there is an integer $a$ satisfying that $a_i - b_i \geq a + 1$ and
$a_j \in \{a, a+1\}$ for every $j \neq i$. Note that this $a$ can be taken as $a \geq 0$; indeed, this is obvious if
some $a_j$ with $j \neq i$ is positive, while the remaining possibility that $a_j = 0$ for every $j \neq i$ allows
us to choose $a = 0$ as $a_i = |X| > d = b_i$ and $a_i - b_i \geq 1$. Let $k$ be the number of indices $j \neq i$
with $a_j = a + 1$, therefore $0 \leq k \leq n - 1$. Then we have $a_i = |X| - (n-1)a - k$, while $b_i = d$,
therefore the condition $a_i - b_i \geq a + 1$ implies that $k \leq |X| - na - d - 1$.
Now we write the values of $\varphi_1$ and $\varphi_2$ in this case as $\varphi_1(a, k)$ and $\varphi_2(a, k)$,
respectively. Then we have

$$ \begin{aligned}
\varphi_1(a, k) &= k(a+1)_2 + (n-1-k)(a)_2 + (a_i)_2 , \\
\varphi_2(a, k) &= k(a+1)_2 + (n-1-k)(a)_2 + (a_i - d)_2 ,
\end{aligned} \tag{11} $$

therefore $\varphi_1(a, k) - \varphi_2(a, k) = 2da_i - d^2 - d$. Now by Lemma 4 we have

$$ \begin{aligned}
1 - \frac{\varphi_2(a,k)}{\varphi_1(a,k)} = \frac{2da_i - d^2 - d}{\varphi_1(a,k)}
 &\leq \frac{2da_i - d^2 - d + 2d((n-1)a + k)}{\varphi_1(a,k) + 2d((n-1)a + k)} \\
 &= \frac{2d|X| - d^2 - d}{k(a+1)_2 + (n-1-k)(a)_2 + (a_i)_2 + 2d(n-1)a + 2dk}
\end{aligned} \tag{12} $$

(note that $2d((n-1)a + k) \geq 0$ as $a \geq 0$). Let $\psi(a, k)$ denote the denominator of the right-hand side.
Then, by virtue of the property $\partial a_i / \partial k = -1$, we have

$$ \frac{\partial}{\partial k} \psi(a, k) = (a+1)_2 - (a)_2 - (2a_i - 1) + 2d = 2a - 2a_i + 1 + 2d < 0 \tag{13} $$

(note that $a_i - d \geq a + 1$), therefore $\psi(a, k)$ is decreasing as $k$ is increasing. On the other hand, we
have $\psi(a, n-1) = \psi(a+1, 0)$. Now note that $a \leq (|X| - d - 1)/n$ as $0 \leq k \leq |X| - na - d - 1$. This
implies that $\psi(a, k)$ takes the minimum value at $a = \lfloor (|X| - d - 1)/n \rfloor = a_0$ and
$k = k_0 := |X| - na_0 - d - 1$ (note that $k_0 \leq n - 1$). Moreover, we have $a_i = a_0 + d + 1$ if $a = a_0$ and
$k = k_0$. Hence a straightforward calculation shows that

$$ 1 - \frac{\varphi_2(a,k)}{\varphi_1(a,k)} \leq \frac{2d|X| - d^2 - d}{\psi(a_0, k_0)}
 = \frac{2d|X| - d^2 - d}{2a_0|X| + 2d|X| - n(a_0+1)a_0 - 2da_0 - d^2 - d} , \tag{14} $$

therefore

$$ \begin{aligned}
\frac{\varphi_2(a,k)}{\varphi_1(a,k)}
 &\geq 1 - \frac{2d|X| - d^2 - d}{2a_0|X| + 2d|X| - n(a_0+1)a_0 - 2da_0 - d^2 - d} \\
 &= \frac{2a_0|X| - n(a_0+1)a_0 - 2da_0}{2a_0|X| + 2d|X| - n(a_0+1)a_0 - 2da_0 - d^2 - d} ,
\end{aligned} \tag{15} $$

which proves the lower bound (3) in the statement of Lemma 1.

Finally, suppose that $d \geq 2$, and let $\eta_1(d)$ and $\eta_2(d)$ denote the denominator and the numerator
in (3), respectively. For any value $x$ depending on $d$, let $\Delta[x]$ temporarily denote the value of $x$ at
$d - 1$ minus the value of $x$ at $d$. Then we have $\Delta[-d^2 - d] = 2d$, therefore

$$ \begin{aligned}
\Delta[\eta_2(d)] &= \Delta[2a_0|X| - n(a_0+1)a_0 - 2da_0] , \\
\Delta[\eta_1(d)] &= \Delta[2a_0|X| - n(a_0+1)a_0 - 2da_0] - 2|X| + 2d < \Delta[\eta_2(d)] .
\end{aligned} \tag{16} $$

Moreover, we have $\Delta[a_0] \in \{0, 1\}$, and if $\Delta[a_0] = 0$, then $\Delta[\eta_2(d)] = 2a_0 \geq 0$. On
the other hand, if $\Delta[a_0] = 1$, then we have

$$ \begin{aligned}
\Delta[(a_0+1)a_0] &= (a_0+2)(a_0+1) - (a_0+1)a_0 = 2(a_0+1) , \\
\Delta[2da_0] &= 2(d-1)(a_0+1) - 2da_0 = 2d - 2a_0 - 2 ,
\end{aligned} \tag{17} $$

therefore

$$ \begin{aligned}
\Delta[\eta_2(d)] &= 2|X| - 2n(a_0+1) - 2d + 2a_0 + 2 \\
 &= 2|X| - 2(n-1)a_0 - 2n - 2d + 2 \\
 &\geq 2|X| - 2(n-1)\frac{|X| - d - 1}{n} - 2n - 2d + 2 \\
 &= \frac{2}{n}\left( |X| - d + 2n - 1 - n^2 \right) \geq 0
\end{aligned} \tag{18} $$

(where we used the assumption $|X| \geq d + (n-1)^2$). Now by Lemma 4 we have

$$ \frac{\eta_2(d-1)}{\eta_1(d-1)} = \frac{\eta_2(d) + \Delta[\eta_2(d)]}{\eta_1(d) + \Delta[\eta_1(d)]}
 \geq \frac{\eta_2(d)}{\eta_1(d) + \Delta[\eta_1(d)] - \Delta[\eta_2(d)]} \geq \frac{\eta_2(d)}{\eta_1(d)} . \tag{19} $$

Hence the proof of Lemma 1 is concluded. □
4 PRGs and FDPs



As our second application of FDPs, in this section we present some results which prove that any
(computationally indistinguishable) PRG with some parameter is also an nb-PRG with a (possibly different) specified
parameter. The concrete relations between parameters for an algorithm as a PRG and as an nb-PRG, 
respectively, will be determined by applying FDPs. 

First we recall the security notion for PRGs. We emphasize that, for the sake of simplicity, here we 
adopt definitions in the form of concrete security rather than asymptotic security. Let $U_X$ denote the uniform
probability distribution over a finite set $X$.

Definition 3 (see e.g., [4]). Let $G : S \to X$ be an algorithm with finite input set $S$ and finite output set $X$.
Given parameters $T \geq 0$ and $\varepsilon \geq 0$, $G$ is called a $(T, \varepsilon)$-secure pseudorandom
generator (PRG) if, for any algorithm (called a distinguisher) $D : X \to \{0, 1\}$ with time complexity bounded by
$T$, we have $\mathrm{Adv}_D(G) \leq \varepsilon$, where $\mathrm{Adv}_D(G)$ denotes the advantage of $D$ defined by

$$ \mathrm{Adv}_D(G) := \left| \Pr[D(U_X) = 1] - \Pr[D(G(U_S)) = 1] \right| . \tag{20} $$

Let $\Delta(P_1, P_2)$ denote the statistical distance of two probability distributions $P_1$, $P_2$ over the same
finite set $Z$ defined by

$$ \Delta(P_1, P_2) := \frac{1}{2} \sum_{z \in Z} \left| \Pr[P_1 = z] - \Pr[P_2 = z] \right| \tag{21} $$

$$ \phantom{\Delta(P_1, P_2) :} = \max_{E \subseteq Z} \left| \Pr[P_1 \in E] - \Pr[P_2 \in E] \right| . \tag{22} $$

Then the advantage $\mathrm{Adv}_D(G)$ of a distinguisher $D$ defined above is equal to
$\Delta(D(U_X), D(G(U_S)))$, as both $D(U_X)$ and $D(G(U_S))$ are probability distributions over $\{0, 1\}$. This
interpretation of the advantage gives us a motivation to enhance the above security notion of PRGs, as in the
following definition introduced by Dubrov and Ishai [3] (with slightly different formulation):

Definition 4 ([3]). Let $G : S \to X$ be an algorithm with finite input set $S$ and finite output set $X$.
Given parameters $T \geq 0$, $\varepsilon \geq 0$ and an integer $n \geq 2$, $G$ is called
$(T, n, \varepsilon)$-secure if, for any algorithm (distinguisher) $D : X \to \{0, 1, \dots, n-1\}$ with time
complexity bounded by $T$, we have $\mathrm{Adv}_D(G) \leq \varepsilon$, where we put
$\mathrm{Adv}_D(G) := \Delta(D(U_X), D(G(U_S)))$. Such an algorithm $G$ is called a PRG that fools non-boolean
distinguishers (nb-PRG, in short).

Note that $(T, 2, \varepsilon)$-security is equivalent to $(T, \varepsilon)$-security in Definition 3. Several
applications of nb-PRGs are discussed in [3]. For example, it was shown that randomness used in some kinds of
information-theoretically secure protocols (such as multi-party computation of certain types) can be replaced with
outputs of nb-PRGs, without any restriction on computational complexity of the adversary against the protocol.
However, despite the significance of nb-PRGs mentioned above, it seems much more difficult to construct
secure nb-PRGs than usual PRGs against 1-bit output distinguishers. Indeed, to the authors'
best knowledge, the only constructions of nb-PRGs in the literature so far are the ones by Dubrov and Ishai
themselves in the original paper [3], and their construction is based on a certain computational assumption
which is less standard than those used in constructions of usual PRGs. Hence, it is worthwhile to investigate a
method to construct nb-PRGs (under standard computational assumptions).

Our proposal here is to establish a general theorem of the following form: Any $(T', \varepsilon')$-secure PRG is
also a $(T, n, \varepsilon)$-secure nb-PRG, where the parameters $T'$ and $\varepsilon'$ as a usual PRG are
determined by $T$, $n$ and $\varepsilon$ in a certain manner. Such an implication result is evidently meaningful, as
it enables us to convert a large number of existing PRGs under standard assumptions into nb-PRGs. In fact, an
implication relation as above has been mentioned (without proof) in [3]. Our aim here is to improve the preceding
relation by introducing the idea of FDPs.

The above-mentioned relation is derived from the first expression (21) of statistical distance, in the
following manner (which refers to a description in [7]). We introduce some notations. Put
$Y := \{0, 1, \dots, n-1\}$ for simplicity. For any subset $Z \subseteq Y$, let $\chi_Z : Y \to \{0, 1\}$ denote the
characteristic function of $Z$ defined by $\chi_Z(x) = 1$ if $x \in Z$ and $\chi_Z(x) = 0$ if $x \in Y \setminus Z$.
We write $\chi_z = \chi_{\{z\}}$ for simplicity when $Z = \{z\}$. In this setting, for any PRG $G : S \to X$ and any
non-boolean distinguisher $D : X \to Y$, the statistical distance $\Delta(D(U_X), D(G(U_S)))$ is equal to

$$ \begin{aligned}
&\frac{1}{2} \sum_{y \in Y} \left| \Pr[D(U_X) = y] - \Pr[D(G(U_S)) = y] \right| \\
&\quad = \frac{1}{2} \sum_{y \in Y} \left| \Pr[\chi_y \circ D(U_X) = 1] - \Pr[\chi_y \circ D(G(U_S)) = 1] \right|
\end{aligned} \tag{23} $$

where $\chi_y \circ D$ denotes an algorithm performed by first executing the distinguisher $D$ and then evaluating
the output of $D$ by the function $\chi_y$. An important property is that $\chi_y \circ D$ is a 1-bit output
algorithm, therefore it can be regarded as a distinguisher for the PRG $G$. This implies that, to show that a
$(T', \varepsilon')$-secure PRG $G$ is also a $(T, n, \varepsilon)$-secure nb-PRG, it suffices to choose the
parameters as $T' = T + \delta_1$ and $\varepsilon' = 2\varepsilon/n$, where $\delta_1$ is the maximum of the
overhead in computational complexity of composing some $\chi_y$ ($y \in Y$) to $D$ (usually, $\delta_1$ can be set
to be almost zero in practical situations). In other words, we have the following proposition
(which has been mentioned in [3]):

Proposition 1. In this setting, any $(T + \delta_1, 2\varepsilon/n)$-secure PRG is also
$(T, n, \varepsilon)$-secure, where the quantity $\delta_1$ is defined in the above manner.
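
The decomposition behind Proposition 1 can be checked numerically on a toy example: the statistical distance of the two output distributions of a non-boolean distinguisher is half the sum of the advantages of the boolean distinguishers $\chi_y \circ D$, and hence at most $n/2$ times the largest of them. The generator and distinguisher below are hypothetical toys chosen only to make the arithmetic visible.

```python
from fractions import Fraction

def output_distribution(D, inputs, n):
    """Exact distribution of D(u) over {0, ..., n-1} for u uniform on inputs."""
    counts = [0] * n
    for u in inputs:
        counts[D(u)] += 1
    return [Fraction(c, len(inputs)) for c in counts]

def statistical_distance(P, Q):
    """Half of the L1 distance between two distributions on the same finite set."""
    return sum(abs(p - q) for p, q in zip(P, Q)) / 2

# Hypothetical toy generator (2-bit seed repeated) and non-boolean distinguisher.
seeds, outputs, n = range(4), range(16), 4
toy_G = lambda s: (s << 2) | s
D = lambda x: (x >> 2) ^ (x & 3)  # values in {0, 1, 2, 3}

P_random = output_distribution(D, outputs, n)
P_pseudo = output_distribution(lambda s: D(toy_G(s)), seeds, n)
delta = statistical_distance(P_random, P_pseudo)

# The advantage of the boolean distinguisher chi_y composed with D is |P_random[y] - P_pseudo[y]|.
per_value_adv = [abs(p - q) for p, q in zip(P_random, P_pseudo)]
print(delta, per_value_adv)
```

In this example the bound $(n/2) \cdot \max_y \mathrm{Adv}_{\chi_y \circ D}(G)$ is loose, which reflects the overhead in the advantage discussed in the text.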

A drawback of this result is that, in practical applications the parameter n (which is relevant to the 
allowable input size for an adversary against a protocol under consideration) should frequently be large, 
which makes the overhead in a bound of advantage in Proposition [I] too heavy. We try to resolve the 
drawback by improving or modifying the above result. 

Our first idea is to use the second expression (22) of statistical distance instead of the first one (21) used in the preceding argument. Namely, in the same setting as above, the statistical distance $\Delta(D(U_X), D(G(U_S)))$ is equal to

\[ \begin{aligned} & \max_{Z \subseteq Y} \bigl| \Pr[D(U_X) \in Z] - \Pr[D(G(U_S)) \in Z] \bigr| \\ &= \max_{Z \subseteq Y} \bigl| \Pr[\chi_Z \circ D(U_X) = 1] - \Pr[\chi_Z \circ D(G(U_S)) = 1] \bigr| \\ &= \max_{Z \subseteq Y} \mathrm{Adv}_{\chi_Z \circ D}(G) . \end{aligned} \tag{24} \]

In the same way as Proposition 1, the above argument implies the following result:

Proposition 2. In this setting, any $(T + \delta_2, \varepsilon)$-secure PRG is also $(T, n, \varepsilon)$-secure, where $\delta_2$ is the maximum overhead in computational complexity of composing some $\chi_Z$ with $Z \subseteq Y := \{0, 1, \dots, n-1\}$ with $D$.

In contrast to Proposition 1, there is no overhead in the bound $\varepsilon$ on the advantage in Proposition 2. Instead, however, the overhead $\delta_2$ in the bound on the time complexity of distinguishers is expected to be too heavy, as the set $Y$ (of somewhat large size) may contain an extremely complicated subset $Z$, for which the computation of $\chi_Z$ would be inefficient.
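As a small numerical illustration of the two expressions (21) and (22) used in Propositions 1 and 2, the following Python sketch evaluates both on a four-element output set. The two distributions are made-up values chosen only for illustration, and the function names `sd_sum` and `sd_max` are ours, not notation from this paper.

```python
from itertools import combinations

def sd_sum(p, q):
    """First expression (21): half of the L1 distance between p and q."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2

def sd_max(p, q):
    """Second expression (22): maximum of |p(Z) - q(Z)| over all subsets Z of Y."""
    n = len(p)
    best = 0.0
    for size in range(n + 1):
        for z in combinations(range(n), size):
            best = max(best, abs(sum(p[y] for y in z) - sum(q[y] for y in z)))
    return best

# Hypothetical output distributions of D(U_X) and D(G(U_S)) on Y = {0, 1, 2, 3}.
p = [0.25, 0.25, 0.25, 0.25]
q = [0.40, 0.10, 0.30, 0.20]
print(sd_sum(p, q), sd_max(p, q))
```

The loop in `sd_max` runs over all $2^n$ subsets of $Y$, which mirrors why the overhead $\delta_2$ can become heavy, whereas `sd_sum` only needs the $n$ one-point sets.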

From now, we try to improve the above-mentioned trade-off between the overheads in the bounds on the advantage and on the computational complexity, by applying the idea of FDPs. Put $Y := \{0, 1, \dots, n-1\}$ as above, let $C$ be the set of characteristic functions $\chi_Z \colon Y \to \{0, 1\}$ for subsets $Z \subseteq Y$, and let $d = d_H$ (see (2)). Then for $\chi_{Y_1}, \chi_{Y_2} \in C$, $d_H(\chi_{Y_1}, \chi_{Y_2})$ is equal to the size of the symmetric difference $Y_1 \ominus Y_2 := (Y_1 \setminus Y_2) \cup (Y_2 \setminus Y_1)$ of the two subsets $Y_1$ and $Y_2$. Now we fix a subset $C'$ of $C$. Let $\delta_3$ be the maximum overhead in computational complexity of composing some $\chi_Z \in C'$ with $D$. Moreover, we put $r := r(C, C')$ for simplicity. Then we have the following result (we notice that, when $C' = C$, the theorem gives almost the same result as Proposition 2):

Theorem 1. In the above situation, let $\delta_1$ be as specified in Proposition 1. If $G \colon S \to X$ is $(T + \delta_1, \varepsilon_1)$-secure and $(T + \delta_3, \varepsilon_3)$-secure, then $G$ is also $(T, n, r\varepsilon_1 + \varepsilon_3)$-secure.

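Proof. For each distinguisher $D \colon X \to Y := \{0, 1, \dots, n-1\}$, we write $\mu(Z) := \Pr[D(U_X) \in Z]$ and $\mu'(Z) := \Pr[D(G(U_S)) \in Z]$ for a subset $Z \subseteq Y$. Let $Y_0$ be a subset of $Y$ that attains the maximum in the second expression (22) of the statistical distance:

\[ \Delta(D(U_X), D(G(U_S))) = |\mu(Y_0) - \mu'(Y_0)| . \tag{25} \]

Note that $Y_0$ can be chosen in such a way that $\mu(Y_0) - \mu'(Y_0) \ge 0$ (if this inequality fails, use $Y \setminus Y_0$ instead of $Y_0$), therefore

\[ \Delta(D(U_X), D(G(U_S))) = \mu(Y_0) - \mu'(Y_0) . \tag{26} \]

Moreover, by the definition of $r$, there is a subset $Y_1 \subseteq Y$ satisfying $\chi_{Y_1} \in C'$ and $d_H(\chi_{Y_0}, \chi_{Y_1}) = |Y_0 \ominus Y_1| \le r$. Now we have

\[ \nu(Y_0) - \nu(Y_1) = \nu(Y_0 \setminus Y_1) - \nu(Y_1 \setminus Y_0) \quad \text{for each } \nu \in \{\mu, \mu'\} , \tag{27} \]

therefore we have

\[ \begin{aligned} & (\mu(Y_0) - \mu'(Y_0)) - (\mu(Y_1) - \mu'(Y_1)) \\ &= (\mu(Y_0) - \mu(Y_1)) - (\mu'(Y_0) - \mu'(Y_1)) \\ &= (\mu(Y_0 \setminus Y_1) - \mu'(Y_0 \setminus Y_1)) - (\mu(Y_1 \setminus Y_0) - \mu'(Y_1 \setminus Y_0)) . \end{aligned} \tag{28} \]

Moreover, the right-hand side is equal to

\[ \begin{aligned} & \sum_{y \in Y_0 \setminus Y_1} (\mu(\{y\}) - \mu'(\{y\})) - \sum_{y \in Y_1 \setminus Y_0} (\mu(\{y\}) - \mu'(\{y\})) \\ &\le \sum_{y \in Y_0 \ominus Y_1} |\mu(\{y\}) - \mu'(\{y\})| \\ &= \sum_{y \in Y_0 \ominus Y_1} \bigl| \Pr[\chi_y \circ D(U_X) = 1] - \Pr[\chi_y \circ D(G(U_S)) = 1] \bigr| \\ &= \sum_{y \in Y_0 \ominus Y_1} \mathrm{Adv}_{\chi_y \circ D}(G) . \end{aligned} \tag{29} \]

Now if $D$ has computational complexity bounded by $T$, then the assumption on $G$ and the definition of $\delta_1$ imply that

\[ \sum_{y \in Y_0 \ominus Y_1} \mathrm{Adv}_{\chi_y \circ D}(G) \le \sum_{y \in Y_0 \ominus Y_1} \varepsilon_1 = |Y_0 \ominus Y_1| \cdot \varepsilon_1 \le r \varepsilon_1 . \tag{30} \]

Summarizing, we have

\[ (\mu(Y_0) - \mu'(Y_0)) - (\mu(Y_1) - \mu'(Y_1)) \le r \varepsilon_1 . \tag{31} \]

This and (26) imply that

\[ \begin{aligned} \Delta(D(U_X), D(G(U_S))) &= (\mu(Y_0) - \mu'(Y_0)) - (\mu(Y_1) - \mu'(Y_1)) + (\mu(Y_1) - \mu'(Y_1)) \\ &\le r \varepsilon_1 + (\Pr[D(U_X) \in Y_1] - \Pr[D(G(U_S)) \in Y_1]) \\ &\le r \varepsilon_1 + \bigl| \Pr[\chi_{Y_1} \circ D(U_X) = 1] - \Pr[\chi_{Y_1} \circ D(G(U_S)) = 1] \bigr| \\ &= r \varepsilon_1 + \mathrm{Adv}_{\chi_{Y_1} \circ D}(G) \le r \varepsilon_1 + \varepsilon_3 , \end{aligned} \tag{32} \]

concluding the proof of Theorem 1. □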



Regarding the relation between the parameters in Theorem 1, first note that it is natural by the definitions to expect that $\delta_1 \le \delta_3$, which allows us to suppose that $\varepsilon_1 \le \varepsilon_3$. Now let us imagine the following situation: we can find an appropriate subset $C' \subseteq C$ in such a way that every characteristic function $\chi_Z \in C'$ has low computational complexity and the quantity $r := r(C, C')$ is small. In this case, $\varepsilon_3$ can be small as well as $r$, and this would make the implication given by Theorem 1 more efficient than those in Propositions 1 and 2, therefore the above-mentioned trade-off is improved. Hence a study of FDPs (in particular, those for functions with 1-bit output sets) will contribute to establishing a better relation between PRGs and nb-PRGs.
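The inequality chain (25)-(32) can be checked numerically in an information-theoretic toy setting, where the bounds $\varepsilon_1$ and $\varepsilon_3$ are replaced by exact maxima for one fixed pair of output distributions. In the following Python sketch the distributions and the family $C'$ (characteristic functions of initial segments of $Y$) are hypothetical choices made only for illustration; they do not come from this paper.

```python
from itertools import combinations

def subsets(n):
    """All subsets of Y = {0, ..., n-1} as frozensets."""
    for size in range(n + 1):
        for z in combinations(range(n), size):
            yield frozenset(z)

def mass(dist, z):
    return sum(dist[y] for y in z)

def covering_radius(n, family):
    """r(C, C'): max over Z of the minimum size of a symmetric difference with a member of C'."""
    return max(min(len(z ^ zp) for zp in family) for z in subsets(n))

n = 4
mu = [0.25, 0.25, 0.25, 0.25]        # hypothetical distribution of D(U_X)
mu_prime = [0.40, 0.10, 0.30, 0.20]  # hypothetical distribution of D(G(U_S))
family = [frozenset(range(t)) for t in range(n + 1)]  # hypothetical C': initial segments

r = covering_radius(n, family)
eps1 = max(abs(a - b) for a, b in zip(mu, mu_prime))              # max over all chi_y
eps3 = max(abs(mass(mu, z) - mass(mu_prime, z)) for z in family)  # max over chi_Z in C'
delta = max(abs(mass(mu, z) - mass(mu_prime, z)) for z in subsets(n))
print(r, eps1, eps3, delta, r * eps1 + eps3)
```

Here the family $C'$ has only $n + 1$ members instead of $2^n$, which is the kind of saving in the complexity overhead that Theorem 1 trades against the factor $r$ in the advantage bound.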

Remark 2. We mention that, for the two applications of FDPs discussed in the last two sections, a kind of "risk-hedging" relation exists as follows. Namely, if we find that the quantity $r(C, C')$ tends to be large in general, then it would support the argument in Section 3 to show that the keyless hash functions under consideration have better security. On the other hand, if we find that the quantity $r(C, C')$ tends to be small in general, then it would support the argument in Section 4 to show that the overheads in the parameters for nb-PRGs compared to PRGs are practically small.

5 Mathematical examples for FDPs 

This section describes some examples of mathematical studies of FDPs themselves, rather than their cryptographic applications such as the ones discussed in Sections 3 and 4. The authors hope that readers will find FDPs to be of independent interest as mathematical problems, and that mathematical studies of FDPs will be promoted.

5.1 Vector spaces and their subspaces: A general bound 

The examples of FDPs discussed below can be interpreted in the following manner. The set $C$ forms a finite-dimensional vector space over a finite field $\mathbb{F}$, with a distinguished basis $v_1, \dots, v_d$ where $d := \dim(C)$, hence each element of $C$ admits a vector expression. The subset $C'$ is a linear subspace of $C$, and the distance $d(f, g)$ is defined to be the (generalized) Hamming distance with respect to the vector expressions of $f, g \in C$. In this subsection, we show general upper and lower bounds for the quantity $r(C, C')$ in this case. Namely, we have the following:

Proposition 3. In the above setting, let $\ell$ denote the minimal integer $\ell'$ satisfying $\sum_{i=0}^{\ell'} \binom{d}{i} (|\mathbb{F}| - 1)^i \ge |\mathbb{F}|^{\operatorname{codim}_C(C')}$, where $\operatorname{codim}_C(C')$ denotes the codimension $d - \dim(C')$ of $C'$ in $C$. Then we have $\ell \le r(C, C') \le \operatorname{codim}_C(C')$.

Proof. Put $d' := \dim(C')$, therefore $\operatorname{codim}_C(C') = d - d'$. First we prove the lower bound. For each $w \in C$ and $k \ge 0$, put $B(w, k) := \{w' \in C \mid d(w, w') \le k\}$. Then we have $|B(w, k)| = \sum_{i=0}^{k} \binom{d}{i} (|\mathbb{F}| - 1)^i$. On the other hand, by the definition of $r(C, C')$, we have $C \subseteq \bigcup_{w \in C'} B(w, r(C, C'))$. This implies that

\[ |C| \le |C'| \cdot \sum_{i=0}^{r(C, C')} \binom{d}{i} (|\mathbb{F}| - 1)^i , \tag{33} \]

or equivalently $|\mathbb{F}|^{d - d'} = |C| / |C'| \le \sum_{i=0}^{r(C, C')} \binom{d}{i} (|\mathbb{F}| - 1)^i$. Hence we have $\ell \le r(C, C')$ by the choice of $\ell$.

Secondly, we prove the upper bound. By applying Gaussian elimination to any basis of $C'$, it follows that there exist a basis $u_1, \dots, u_{d'}$ of $C'$ and distinct indices $i_1, \dots, i_{d'} \in \{1, 2, \dots, d\}$ with the property that, for each $1 \le j \le d'$, the coefficient of the basis element $v_{i_j}$ of $C$ in $u_j$ is 1 and the coefficient of $v_{i_j}$ in any other $u_k$ ($k \ne j$) is 0. Now for an arbitrary element $w = \sum_{i=1}^{d} c_i v_i \in C$ ($c_i \in \mathbb{F}$), put $w' := \sum_{j=1}^{d'} c_{i_j} u_j \in C'$. By the above property of $u_1, \dots, u_{d'}$, the coefficients of $w$ and $w'$ agree at each of the $d'$ positions $i_1, \dots, i_{d'}$, so the distance between $w$ and $w'$ is at most $d - d'$, therefore $d(w, C') \le d - d'$. Hence we have $r(C, C') \le d - d'$, concluding the proof of Proposition 3. □
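The two bounds of Proposition 3 can be compared with a brute-force value of $r(C, C')$ on a tiny instance. The following Python sketch does this over $\mathbb{F}_2$; the choice of $C = \mathbb{F}_2^5$ with $C'$ the repetition code is a hypothetical example of ours, and the helper names are not from this paper.

```python
from itertools import product
from math import comb

def span_f2(basis):
    """All F_2-linear combinations of the given 0/1 tuples."""
    d = len(basis[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(basis)):
        w = [0] * d
        for c, b in zip(coeffs, basis):
            if c:
                w = [(x + y) % 2 for x, y in zip(w, b)]
        words.add(tuple(w))
    return words

def covering_radius(d, code):
    """r(C, C') by exhaustive search over all words of F_2^d."""
    def hamming(u, v):
        return sum(a != b for a, b in zip(u, v))
    return max(min(hamming(w, c) for c in code) for w in product((0, 1), repeat=d))

def lower_bound(d, codim, q=2):
    """Smallest l with sum_{i<=l} binom(d, i) (q-1)^i >= q^codim (the integer l of Proposition 3)."""
    total, l = 0, 0
    while True:
        total += comb(d, l) * (q - 1) ** l
        if total >= q ** codim:
            return l
        l += 1

# Hypothetical example: C = F_2^5, C' spanned by the single (independent) vector 11111.
d, basis = 5, [(1, 1, 1, 1, 1)]
code = span_f2(basis)
codim = d - len(basis)
print(lower_bound(d, codim), covering_radius(d, code), codim)
```

In this instance the lower bound is attained, while the upper bound $\operatorname{codim}_C(C') = 4$ is not.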

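The next result shows how close the lower and upper bounds in Proposition 3 are to each other:

Proposition 4. In the setting of Proposition 3, we have

\[ \begin{aligned} \ell \le \operatorname{codim}_C(C') &\le \log_{|\mathbb{F}|} \bigl( c_\ell (|\mathbb{F}| - 1)^\ell d^\ell / \ell! \bigr) \\ &= \ell \bigl( \log_{|\mathbb{F}|}(|\mathbb{F}| - 1) + \log_{|\mathbb{F}|} d \bigr) + \log_{|\mathbb{F}|} c_\ell - \log_{|\mathbb{F}|} \ell! , \end{aligned} \tag{34} \]

where $c_\ell = \ell + 1$ if $\mathbb{F}$ is the two-element field $\mathbb{F}_2$, and $c_\ell = (|\mathbb{F}| - 1)/(|\mathbb{F}| - 2)$ otherwise.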
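Proof. It suffices to prove the second inequality. As $|\mathbb{F}|^{\operatorname{codim}_C(C')} \le \sum_{i=0}^{\ell} \binom{d}{i} (|\mathbb{F}| - 1)^i$ by the definition of $\ell$, it suffices to show that $\sum_{i=0}^{\ell} \binom{d}{i} (|\mathbb{F}| - 1)^i \le c_\ell (|\mathbb{F}| - 1)^\ell d^\ell / \ell!$, or more generally, $\sum_{i=0}^{m} \binom{N}{i} (q - 1)^i \le c'_m (q - 1)^m N^m / m!$ for all integers $N \ge m \ge 0$ and $q \ge 2$, where we put $c'_m := (q - 1)/(q - 2)$ if $q \ge 3$ and $c'_m := m + 1$ if $q = 2$, and we set $0^0 = 1$ (note that $\ell \le \operatorname{codim}_C(C') \le d$). We use induction on $m$. The case $m = 0$ is trivial. For the case $m \ge 1$, we have

\[ \sum_{i=0}^{m} \binom{N}{i} (q - 1)^i = \sum_{i=0}^{m-1} \binom{N}{i} (q - 1)^i + \binom{N}{m} (q - 1)^m \le \frac{c'_{m-1} (q - 1)^{m-1} N^{m-1}}{(m - 1)!} + \frac{(q - 1)^m N^m}{m!} . \tag{35} \]

By the relation $m \le N$ and the definition of $c'_m$, we have

\[ \frac{c'_{m-1} (q - 1)^{m-1} N^{m-1}}{(m - 1)!} \cdot \frac{m!}{(q - 1)^m N^m} + 1 = \frac{c'_{m-1} \, m}{(q - 1) N} + 1 \le \frac{c'_{m-1}}{q - 1} + 1 = c'_m , \tag{36} \]

therefore the desired inequality holds for this $m$ as well. Hence the claim of Proposition 4 holds. □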



5.2 Boolean functions of low degrees 

As a first concrete example, here we deal with the set $C$ of the functions $X \to Y$ with $n$-bit inputs and 1-bit outputs, i.e., we set $X := \{0, 1\}^n$ and $Y := \{0, 1\}$ (which is relevant to the situation of Section 4). First note that, when we identify $\{0, 1\}$ naturally with $\mathbb{F}_2$, each function $f \colon X \to Y$ can be expressed as an $n$-variable square-free polynomial:

\[ f(x_1, \dots, x_n) = \sum_{a = (a_1, \dots, a_n) \in \{0, 1\}^n} f(a) \, \chi_a(x_1, \dots, x_n) \tag{37} \]

where we put

\[ \chi_a(x_1, \dots, x_n) := \prod_{i \colon a_i = 0} (1 - x_i) \prod_{i \colon a_i = 1} x_i \quad \text{for } a = (a_1, \dots, a_n) \tag{38} \]

(note that $\chi_a(x_1, \dots, x_n) = 1$ if $x_i = a_i$ for every $i$ and $\chi_a(x_1, \dots, x_n) = 0$ otherwise, therefore $\chi_a$ is indeed the characteristic function of $a \in \{0, 1\}^n$). For example, when $n = 2$ we have

\[ f(x_1, x_2) = f(0, 0)(1 - x_1)(1 - x_2) + f(0, 1)(1 - x_1) x_2 + f(1, 0) x_1 (1 - x_2) + f(1, 1) x_1 x_2 . \tag{39} \]

Now for each $0 \le k \le n$, we set $C' = C'_k$ to be the subset of $C$ consisting of the functions that can be expressed as a square-free polynomial of degree at most $k$. For example, $C'_0$ is the set of constant functions, and $C'_1$ is the set of affine functions. The distance $d(f, g) = d_H(f, g)$ is the Hamming distance as before. Note that changing the value of $f \in C$ at a point $a \in \{0, 1\}^n$ is equivalent to adding the function $\chi_a$ to $f$. In this situation, we have the following upper and lower bounds for the quantity $r(C, C'_k)$:

Proposition 5. In the above setting, put $u_{n,k} := \sum_{i=k+1}^{n} \binom{n}{i}$, and let $\ell_{n,k}$ be the minimum integer $i$ satisfying $2^{u_{n,k}} \le \sum_{j=0}^{i} \binom{2^n}{j}$. Then we have

\[ \ell_{n,k} \le r(C, C'_k) \le \min\{u_{n,k}, 2^{n-1}\} . \tag{40} \]

Proof. For the upper bound, note that $r(C, C'_k) \le 2^{n-1}$, as any function $f \in C$ can be converted into a constant function by changing the value $f(x)$ at every point $x \in \{0, 1\}^n$ with the property that $f(x)$ is in the minority among the $2^n$ values of $f$ (the number of such points is at most $2^{n-1}$). Then the upper bound follows from Proposition 3, as $C$ is an $\mathbb{F}_2$-vector space of dimension $2^n$ and $C'_k$ is its subspace of codimension $u_{n,k}$. The lower bound also follows from Proposition 3. □

By Proposition 4, the quantities $\ell_{n,k}$ and $u_{n,k}$ in Proposition 5 satisfy the relation $\ell_{n,k} \le u_{n,k} \le n \ell_{n,k} + \log_2(\ell_{n,k} + 1) - \log_2 \ell_{n,k}!$. Table 1 gives the precise values of $\ell_{n,k}$ for some smaller cases.



Table 1: The values of $\ell_{n,k}$ for some small parameters

[Table body: row and column labels were lost; surviving entries, in order: 16, 31, 49, G, 19, 43, 75, 105, 128]
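The quantities $u_{n,k}$ and $\ell_{n,k}$ of Proposition 5 are straightforward to evaluate with exact integer arithmetic. The following Python sketch computes them directly from their definitions; the function names `u` and `ell` are ours, and the parameter pairs in the loop are chosen only as small illustrations.

```python
from math import comb

def u(n, k):
    """u_{n,k} = sum_{i=k+1}^{n} binom(n, i), the codimension of C'_k in C."""
    return sum(comb(n, i) for i in range(k + 1, n + 1))

def ell(n, k):
    """l_{n,k}: the minimum i with 2^{u_{n,k}} <= sum_{j=0}^{i} binom(2^n, j)."""
    target, total, i = 2 ** u(n, k), 0, 0
    while True:
        total += comb(2 ** n, i)
        if total >= target:
            return i
        i += 1

for n, k in [(2, 1), (3, 1), (4, 1)]:
    # lower bound, and upper bound min{u_{n,k}, 2^(n-1)} from (40)
    print(n, k, ell(n, k), min(u(n, k), 2 ** (n - 1)))
```

For $(n, k) = (4, 1)$ this reproduces the range $4 \le r(C, C'_1) \le 8$ quoted in Example 1.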



Here we introduce a geometric point of view on the above problem. We introduce some notations. For a subset $I \subseteq [n] := \{1, 2, \dots, n\}$, put $x_I := \prod_{i \in I} x_i$, and let $a_I$ be the element $(a_1, \dots, a_n)$ of $\{0, 1\}^n$ determined by $a_i = 1$ when and only when $i \in I$. We write $\delta_I := \chi_{a_I}$ for simplicity. Let $\widetilde{\Delta}^{n-1}$ be the disjoint union of an isolated point $P$ and the standard $(n-1)$-simplex $\Delta^{n-1}$ on the vertex set $[n]$; we regard $P$ as "the $(-1)$-dimensional face" of $\widetilde{\Delta}^{n-1}$. For each $\emptyset \ne I \subseteq [n]$, let $\langle I \rangle$ denote the $(|I| - 1)$-dimensional sub-simplex of $\Delta^{n-1}$ spanned by $I$, and let $\langle I \rangle^\circ$ be its relative interior (note that $\langle \{i\} \rangle^\circ = \langle \{i\} \rangle = \{i\}$ for each $i \in [n]$). On the other hand, we put $\langle \emptyset \rangle = \langle \emptyset \rangle^\circ := P$. Now for each function $f(x) = \sum_{I \subseteq [n]} c_I x_I$ ($c_I \in \mathbb{F}_2$), we define its geometric realization $G_f$ by

\[ G_f := \bigsqcup_{I ;\, c_I = 1} \langle I^c \rangle^\circ \quad \text{(disjoint union)} , \tag{41} \]

where $I^c$ denotes the complement $[n] \setminus I$ of $I$ in $[n]$. For each $I \subseteq [n]$, by the definition and the fact that $\delta_I = \sum_{J \supseteq I} x_J$ (recall that now the values of functions are in $\mathbb{F}_2$), $G_{\delta_I}$ is the (disjoint) union of $P$ and $\langle J \rangle^\circ$ for all $\emptyset \ne J \subseteq I^c$, therefore we have $G_{\delta_I} = P \cup \langle I^c \rangle$. Moreover, for any $0 \le k \le n$ and $I \subseteq [n]$, we have $|I| \ge k + 1$ if and only if $\langle I^c \rangle$ is at most $(n - k - 2)$-dimensional. This implies that a function $f \in C$ belongs to $C'_k$ if and only if $G_f$ does not intersect the $(n - k - 2)$-dimensional skeleton of $\widetilde{\Delta}^{n-1}$, which consists of the faces of $\widetilde{\Delta}^{n-1}$ of dimension up to $n - k - 2$ (including $P$).

Based on the above observation, we consider the following puzzle. We imagine a situation in which a lamp is associated with each face of $\widetilde{\Delta}^{n-1}$. A state of $\widetilde{\Delta}^{n-1}$ is a collection of light/dark properties of all the lamps. Given a function $f$, the corresponding initial state $\mathcal{I}_f$ is defined in such a way that the lamp at a face is light if and only if the relative interior of the face is contained in $G_f$. At any state, the player of the puzzle is allowed to indicate a face $F$ of $\widetilde{\Delta}^{n-1}$ (we call this "pushing the face $F$"), and then the light/dark properties of the lamps at $P$ and at every sub-face of $F$ (including $F$ itself) are flipped; such a process is regarded as a move of the puzzle. An initial state $\mathcal{I}_f$ is said to be solved when the lamps at all faces of dimension at most $n - k - 2$ (including $P$) are switched off by a sequence of moves started from $\mathcal{I}_f$. With this interpretation, the distance $d(f, C'_k)$ from $f \in C$ to $C'_k$ is the minimum number of moves needed to solve $\mathcal{I}_f$, and the quantity $r(C, C'_k)$ is the minimum number of moves that suffices to solve every initial state.

Moreover, we also introduce a simplified puzzle on $\Delta^{n-1}$ instead of $\widetilde{\Delta}^{n-1}$ by ignoring the isolated point $P$ in the above puzzle. Let $r'_{n,k}$ denote the minimum number of moves that suffices to solve (in the simplified puzzle) every initial state. Then we have $r(C, C'_k) = r'_{n,k} + 1$: for an initial state $\mathcal{I}$ of the simplified puzzle whose solution requires precisely $r'_{n,k}$ moves, one of the two initial states of the original puzzle obtained by adding a lamp at $P$ that is light or dark, respectively, requires $r'_{n,k} + 1$ moves. Hence it suffices to consider the simplified puzzle on $\Delta^{n-1}$ for determining the quantity $r(C, C'_k)$.

Example 1. We set $n = 4$ and show that $r(C, C'_1) = 6$, or equivalently $r'_{4,1} = 5$. Note that the general bounds in Proposition 5 only guarantee that $4 \le r(C, C'_1) \le 8$ (note that $u_{4,1} = 11 > 8 = 2^{4-1}$). We naturally identify each state in the puzzle on $\Delta^{n-1} = \Delta^3$ with a family of non-empty subsets of $[n] = [4]$, and we write $\{i_1, i_2, \dots, i_\ell\}$ as $i_1 i_2 \cdots i_\ell$ for simplicity. Moreover, to express each state we omit the subsets of $[4]$ of size larger than 2, as the lamps at faces of dimension at least $n - k - 1 = 2$ are not relevant to determining whether the puzzle has been solved or not. In other words, in the present situation, we can regard each state as an edge and vertex coloring of the complete graph $K_4$.

First, we show that the initial state $\mathcal{I} = \{13, 24\}$ requires more than 4 moves to solve. Assume to the contrary that $\mathcal{I}$ can be solved by at most 4 moves. If the player pushes the face 1234, then the state $\{1, 2, 3, 4, 12, 23, 34, 41\}$ is obtained. To solve this state by at most 3 remaining moves, the player has to push at least one 2-dimensional face; we may assume by symmetry that the face is 123. Then the resulting state is $\{4, 13, 34, 41\}$; however, a case-by-case analysis shows that it is impossible to solve this state by at most 2 remaining moves. Therefore the player does not push the face 1234. On the other hand, if the player pushes a 2-dimensional face, then we may assume by symmetry that the face is 123, resulting in the state $\{1, 2, 3, 12, 23, 24\}$. To solve this state by at most 3 remaining moves, the player has to push at least one more 2-dimensional face. If it is 124, then we obtain the state $\{3, 4, 23, 41\}$, but a case-by-case analysis shows that it is impossible to solve this state by at most 2 remaining moves (the case of 234 is similar by symmetry). If it is 134, then we obtain the state $\{2, 4, 12, 13, 14, 23, 24, 34\}$, but a case-by-case analysis shows that it is impossible to solve this state by at most 2 remaining moves as well. Therefore the player does not push a 2-dimensional face. This implies that the player should push 13 and 24, resulting in the state $\{1, 2, 3, 4\}$, which cannot be solved by at most 2 remaining moves. Hence we have a contradiction, therefore the initial state $\mathcal{I} = \{13, 24\}$ indeed requires more than 4 moves to solve.

Secondly, we show that any initial state $\mathcal{I}$ can be solved by at most 5 moves. The player can solve $\mathcal{I}$ by at most 4 moves when no lamps in $\mathcal{I}$ at 1-dimensional faces are light, therefore $\mathcal{I}$ can be solved by at most 5 moves when at most 1 lamp in $\mathcal{I}$ at a 1-dimensional face is light. When 2 lamps in $\mathcal{I}$ at 1-dimensional faces are light, a case-by-case analysis shows that $\mathcal{I}$ can be solved by at most 4 moves unless $\mathcal{I}$ is of the form $\{i_1 i_2, i_3 i_4\}$ with $\{i_1, i_2\} \cap \{i_3, i_4\} = \emptyset$, and any $\mathcal{I}$ of the latter form can be solved by pushing the faces 1234, $i_1 i_2 i_3$, $i_1 i_2 i_4$, $i_1$ and $i_2$. When 3 lamps in $\mathcal{I}$ at 1-dimensional faces are light, the problem can be reduced to the case of 2 light lamps at 1-dimensional faces by pushing one of the 3 light 1-dimensional faces. When 4 lamps in $\mathcal{I}$ at 1-dimensional faces are light, the problem can be reduced to the case of 2 light lamps at 1-dimensional faces by pushing the face 1234 unless $\mathcal{I}$ is of the form $\{1, 2, 3, 4, i_1 i_3, i_1 i_4, i_2 i_3, i_2 i_4\}$ with $\{i_1, i_2\} \cap \{i_3, i_4\} = \emptyset$, and any $\mathcal{I}$ of the latter form can be solved by pushing the faces $i_1 i_2 i_3$, $i_1 i_2 i_4$, $i_1$ and $i_2$. When 5 lamps in $\mathcal{I}$ at 1-dimensional faces are light, the problem can be reduced to the case of 2 light lamps at 1-dimensional faces by pushing an appropriate 2-dimensional face. Finally, when 6 lamps in $\mathcal{I}$ at 1-dimensional faces are light, the problem can be reduced to the case of no light lamps at 1-dimensional faces by pushing the face 1234. Hence any initial state $\mathcal{I}$ can be solved by at most 5 moves, therefore we have $r'_{4,1} = 5$ as desired.
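The case analysis in Example 1 can be cross-checked by exhaustive search, since the simplified puzzle for $(n, k) = (4, 1)$ has only $2^{10}$ relevant states (4 vertex lamps and 6 edge lamps). The following Python sketch is our own encoding of the puzzle, not code from this paper: it runs a breadth-first search from the all-dark state, which is legitimate because every move is an involution, so the search depth of a state equals the minimum number of moves needed to solve it.

```python
from collections import deque
from itertools import combinations

VERTICES = (1, 2, 3, 4)
# Lamps that matter for k = 1: the vertices and edges of K_4.
LAMPS = [frozenset(c) for size in (1, 2) for c in combinations(VERTICES, size)]
INDEX = {lamp: i for i, lamp in enumerate(LAMPS)}
# A move pushes a non-empty face F and flips the lamps at all sub-faces of F.
FACES = [frozenset(c) for size in (1, 2, 3, 4) for c in combinations(VERTICES, size)]
MOVES = [sum(1 << INDEX[l] for l in LAMPS if l <= face) for face in FACES]

def encode(lamps):
    """Bitmask of a state given as a list of vertex/edge tuples."""
    return sum(1 << INDEX[frozenset(l)] for l in lamps)

def min_moves():
    """Minimum number of moves to solve each state (BFS from the solved state)."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        s = queue.popleft()
        for m in MOVES:
            t = s ^ m
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist

dist = min_moves()
print(len(dist), max(dist.values()), dist[encode([(1, 3), (2, 4)])])
```

The largest search depth corresponds to $r'_{4,1}$, and the depth of the state $\{13, 24\}$ is the number of moves discussed in the first half of the example.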

From now, we investigate FDPs in the above setting by using Gröbner bases. Recall that $X = \{0, 1\}^n$. Let $R := K[z_v \mid v \in X]$ be a polynomial ring in $2^n$ variables over a field $K$ of characteristic 0. We define the following ideal of $R$:

\[ I_0 := (z_v^2 - 1 \mid v \in X) \subseteq R . \tag{42} \]

For each $f \in C$, put

\[ z^f := \prod_{v \in X} z_v^{f(v)} . \tag{43} \]

Then the set $\{z^f \mid f \in C\}$ of all square-free monomials in $R$ forms a linear basis of the quotient ring $A_0 := R / I_0$. Note that $z^f z^g \equiv z^{f+g} \pmod{I_0}$ and the degree $\deg(z^f)$ of $z^f$ in $R$ is equal to $d(f, 0)$ for any $f, g \in C$, where $0$ denotes the function in $C$ taking constant value 0.

Let $C'$ be a subset of $C$, which need not be a linear subspace of $C$ unless otherwise specified. We define the following ideal of $R$:

\[ I_{C'} := (z^f - z^g \mid f, g \in C') \subseteq R , \tag{44} \]

and consider the ideal $\widetilde{I}_{C'} := I_0 + I_{C'}$ of $R$. We identify the quotient ring $A_{C'} := R / \widetilde{I}_{C'}$ with the quotient ring of $A_0$ by the image of $I_{C'}$. Now fix a graded monomial order, i.e., a monomial order $\prec$ satisfying $\prod_{v \in X} z_v^{\alpha_v} \prec \prod_{v \in X} z_v^{\beta_v}$ for any exponents $(\alpha_v)_{v \in X}$ and $(\beta_v)_{v \in X}$ with $\sum_{v \in X} \alpha_v < \sum_{v \in X} \beta_v$. Let $\mathcal{G}$ be a Gröbner basis for the ideal $\widetilde{I}_{C'}$, and consider the reduction process with respect to the Gröbner basis $\mathcal{G}$. As each generator of $\widetilde{I}_{C'}$ is of the form "(monic monomial) $-$ (monic monomial)", $\mathcal{G}$ can be chosen in such a way that every element of $\mathcal{G}$ is of the same form, and the linear basis of $A_0$ consisting of the square-free monic monomials can be partitioned into equivalence classes when projected onto the quotient ring $A_{C'}$. This also implies that the normal form $\mathrm{nf}(z^f)$ of each $f \in C$ with respect to $\mathcal{G}$ is a square-free monic monomial, i.e., of the form $z^g$ with $g \in C$, and we have

\[ \begin{aligned} \deg(\mathrm{nf}(z^f)) &= \min\{\deg(z^g) \mid z^g \equiv z^f \pmod{\widetilde{I}_{C'}}\} \\ &= \min\{\deg(z^g) \mid z^g - z^f \in \widetilde{I}_{C'}\} . \end{aligned} \tag{45} \]

Now we consider the case that $0 \in C'$. Note that $(z^f)^2 \equiv 1 = z^0 \pmod{I_0}$ for any $f \in C$. Now if $f, g \in C$ and $z^f \equiv z^g \pmod{\widetilde{I}_{C'}}$, then we have $z^{f+g} \equiv z^f z^g \equiv z^g z^g \equiv z^0 \pmod{\widetilde{I}_{C'}}$, therefore $f + g \in C'$. Conversely, if $f + g \in C'$, then we have $z^f z^g \equiv z^{f+g} \equiv z^0 = 1 \pmod{\widetilde{I}_{C'}}$, therefore $z^f \equiv z^f z^g z^g \equiv z^g \pmod{\widetilde{I}_{C'}}$. Hence $\deg(\mathrm{nf}(z^f))$ is equal to the minimal degree of $z^g$ with $g \in C$ satisfying $f + g \in C'$, therefore $d(f, C') = \deg(\mathrm{nf}(z^f))$. This argument reduces the FDP in this setting to the problem of computing (the degrees of) the normal forms of square-free monomials. More precisely, let $h_i$ denote the number of monic monomials in $A_{C'}$ whose normal forms have degree $i$, and put $s := \max\{i \mid h_i > 0\}$. (If the ideal is homogeneous, then $(h_i)_i$ is called the Hilbert function and it does not depend on the choice of a monomial order.) Now the above argument implies that $r(C, C') = s$. Moreover, if $C'$ is a linear subspace of $C$, then we have $h_i \cdot |C'| = |\{f \in C \mid d(f, C') = i\}|$, therefore the data $(h_i)_i$ express the distribution of the distances $d(f, C')$ over the functions $f \in C$.

Based on the above argument, Proposition 3 can be restated for the present case as follows:

Proposition 6. In the above setting, suppose that $C'$ is a linear subspace of $C$. Then we have

\[ \min\Bigl\{ \ell \Bigm| \sum_{i=0}^{\ell} \binom{2^n}{i} \ge \frac{2^{2^n}}{|C'|} \Bigr\} \le r(C, C') \le \log_2 \frac{2^{2^n}}{|C'|} . \tag{46} \]

Proof. Note that the number of monic monomials in $A_{C'}$ is $2^{2^n} / |C'|$. Then the lower bound follows from the fact that the normal form of each monic monomial is also a monic monomial and that there exist $\binom{2^n}{i}$ square-free monic monomials of degree $i$, hence $h_i \le \binom{2^n}{i}$. On the other hand, the upper bound is deduced from the fact that each divisor of a monic monomial of normal form is also of normal form, hence $h_i = 0$ if $h_j = 0$ and $j < i$. This concludes the proof. □

For the case $C' = C'_k$ as discussed above, Table 2 shows a calculation result of $r(C, C'_k)$ and $(h_i)_i$ for small parameters $n$ and $k$, which is obtained by using the computer algebra software Singular/Sage. By the table, we have $r(C, C'_k) = 6$ when $(n, k) = (4, 1)$, as explained in Example 1. Note that the values of $r(C, C'_k)$ in Table 2 are consistent with the lower bounds shown in Table 1.
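For very small parameters, the data $(h_i)_i$ can also be cross-checked without Gröbner bases, by computing $d(f, C'_k)$ for every $f$ directly and dividing the counts by $|C'_k|$. The following Python sketch is such a brute-force check under our own encoding (truth tables as bitmasks); it is not the Singular/Sage computation used for Table 2, and it is only practical for small $n$ and $k$ such as $(3, 1)$ and $(4, 1)$.

```python
from itertools import combinations

def low_degree_code(n, k):
    """Truth tables (bitmasks over the 2^n points) of all polynomials of degree <= k."""
    points = range(2 ** n)
    code = {0}
    for size in range(k + 1):
        for subset in combinations(range(n), size):
            mask = sum(1 << i for i in subset)
            # The monomial x_I takes value 1 exactly at the points whose support contains I.
            monomial = sum(1 << a for a in points if a & mask == mask)
            code |= {c ^ monomial for c in code}
    return code

def distance_distribution(n, k):
    """Return (r(C, C'_k), [h_0, h_1, ...]) by exhaustive search over all functions."""
    code = low_degree_code(n, k)
    total = 2 ** (2 ** n)
    weight = [bin(x).count("1") for x in range(total)]
    counts = {}
    for f in range(total):
        dmin = min(weight[f ^ c] for c in code)
        counts[dmin] = counts.get(dmin, 0) + 1
    r = max(counts)
    return r, [counts[i] // len(code) for i in range(r + 1)]

print(distance_distribution(3, 1))
print(distance_distribution(4, 1))
```

The division by `len(code)` uses the relation $h_i \cdot |C'| = |\{f \in C \mid d(f, C') = i\}|$ stated above for linear subspaces.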

Table 2: Computer calculation result for some small parameters

  n   k   $r(C, C'_k)$   $(h_i)_{i \ge 0}$
  2   1   1              (1, 1)
  3   1   2              (1, 8, 7)
  3   2   1              (1, 1)
  4   1   6              (1, 16, 120, 560, 875, 448, 28)
  4   2   2              (1, 16, 15)
  4   3   1              (1, 1)

5.3 Perfect codes and Reed-Solomon codes

In this subsection, we consider the case that $C$ is an $n$-dimensional vector space over the $q$-element field $\mathbb{F}_q$, hence $C$ is identified with $\mathbb{F}_q^n$, the distance $d(\cdot, \cdot)$ is defined to be the (generalized) Hamming distance (with respect to the vector expressions of elements), and $C'$ is a linear subspace of $C$ coming from coding theory. Let the subspace $C'$ be an $(n, m, d)$-code, i.e., $\dim(C') = m$ and the minimum distance of $C'$ is $d$. By the definition of minimum distance, we have the following well-known relation

\[ q^m \sum_{i=0}^{\lfloor d/2 \rfloor} \binom{n}{i} (q - 1)^i \le q^n . \tag{47} \]

This and the argument in Proposition 3 imply that $r(C, C') \ge \lfloor d/2 \rfloor$.

We say that $C'$ is a perfect code if the equality holds in (47). For a perfect code $C'$, the above argument and Proposition 3 imply that $r(C, C') = \lfloor d/2 \rfloor$. For example, if $n = 2^k - 1$, $q = 2$ and $C'$ is the Hamming code $H_k$, which is a $(2^k - 1, 2^k - k - 1, 3)$-code, then we have $r(C, C') = 1$. On the other hand, if $n = 23$, $q = 2$ and $C'$ is the binary Golay code $G_{23}$, which is a perfect $(23, 12, 7)$-code, then we have $r(C, C') = 3$. (In the case of the extended Golay code $C' = G_{24}$, which is a nearly perfect $(24, 12, 8)$-code, where we set $n = 24$ and $q = 2$, we also have $r(C, C') = 4$ in a similar manner.)
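The perfectness of the Hamming codes and of the binary Golay code $G_{23}$ amounts to an exact equality between integers, which is easy to confirm numerically. The following Python sketch checks the equality in the relation above for these parameter sets only; the helper names are ours.

```python
from math import comb

def ball_size(n, radius, q):
    """Number of words of F_q^n within Hamming distance `radius` of a fixed word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(radius + 1))

def attains_equality(n, m, d, q):
    """Check q^m * |ball of radius floor(d/2)| == q^n for an (n, m, d)-code over F_q."""
    return q ** m * ball_size(n, d // 2, q) == q ** n

# Hamming codes H_k with parameters (2^k - 1, 2^k - k - 1, 3) for a few small k.
for k in range(2, 6):
    print(k, attains_equality(2 ** k - 1, 2 ** k - k - 1, 3, 2))

# Binary Golay code G_23 with parameters (23, 12, 7).
print(attains_equality(23, 12, 7, 2))
```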

As another concrete class of $C'$ for which the quantity $r(C, C')$ can be explicitly determined, from now we study the case of Reed-Solomon codes, which form another important class of linear codes. We write $q = p^e$ with a prime number $p$ and an integer $e \ge 1$, and choose an integer $k$ with $1 \le k < n$. Take a primitive element $\alpha$ of $\mathbb{F}_q$, i.e., $\mathbb{F}_q^\times = \langle \alpha \rangle$. Define a polynomial $G(x) \in \mathbb{F}_q[x]$ of degree $n - k$ by

\[ G(x) := (x - 1)(x - \alpha)(x - \alpha^2) \cdots (x - \alpha^{n-k-1}) . \tag{48} \]

For any integer $j \ge 0$, let $P_j$ denote the set of polynomials in $\mathbb{F}_q[x]$ of degree up to $j$, which is a $(j + 1)$-dimensional $\mathbb{F}_q$-linear subspace of $\mathbb{F}_q[x]$. We identify $P_{n-1}$ with $C$ via the correspondence $\sum_{i=0}^{n-1} a_i x^i \mapsto \sum_{i=0}^{n-1} a_i v_i$, where $(v_0, \dots, v_{n-1})$ is a distinguished linear basis of $C$. Now we introduce the following two linear maps:

\[ \varphi_{n,k} \colon P_{k-1} \to P_{n-1} , \quad f(x) \mapsto G(x) f(x) , \tag{49} \]

\[ \psi_{n,k} \colon P_{n-1} \to \mathbb{F}_q^{n-k} , \quad f(x) \mapsto (f(1), f(\alpha), f(\alpha^2), \dots, f(\alpha^{n-k-1})) . \tag{50} \]

Let $C'$ be the image of $\varphi_{n,k}$, which is a subspace of $C$ (via the above identification $C \simeq P_{n-1}$). This $C'$ is a Reed-Solomon code. Note that $C'$ coincides with the kernel of $\psi_{n,k}$. Now we have the following result:

Proposition 7. In the above setting of Reed-Solomon codes, we have $r(C, C') = n - k$.

Proof. As $\dim(C') = k$, the inequality $r(C, C') \le n - k$ follows from Proposition 3. From now, we show that $r(C, C') \ge n - k$, or equivalently, that there exists an element $u \in C$ satisfying $d(u, C') \ge n - k$.

For each polynomial $f(x) \in P_{n-1}$, the condition $d(f(x), C') \le n - k - 1$ is equivalent to the following: there exist indices $0 \le \nu_1 < \nu_2 < \cdots < \nu_{n-k-1} \le n - 1$ and coefficients $c_j \in \mathbb{F}_q$ ($1 \le j \le n - k - 1$) for which we have $f(x) - \sum_{j=1}^{n-k-1} c_j x^{\nu_j} \in C' = \ker \psi_{n,k}$, or equivalently,

\[ f(\alpha^i) = \sum_{j=1}^{n-k-1} c_j \beta_{\nu_j}^{\,i} \quad \text{for every } 0 \le i \le n - k - 1 , \tag{51} \]

where we put $\beta_{\nu_j} := \alpha^{\nu_j}$. The condition (51) can be expressed as

\[ \begin{pmatrix} f(\alpha^0) \\ f(\alpha^1) \\ \vdots \\ f(\alpha^{n-k-1}) \end{pmatrix} = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \beta_{\nu_1} & \beta_{\nu_2} & \cdots & \beta_{\nu_{n-k-1}} \\ \vdots & \vdots & & \vdots \\ \beta_{\nu_1}^{\,n-k-1} & \beta_{\nu_2}^{\,n-k-1} & \cdots & \beta_{\nu_{n-k-1}}^{\,n-k-1} \end{pmatrix} c , \tag{52} \]

where $c$ denotes the column vector ${}^t(c_1, c_2, \dots, c_{n-k-1})$. For simplicity, let $B$ and $b$ denote, respectively, the first $n - k - 1$ rows and the last row of the above matrix; i.e., the above condition is written as

\[ {}^t\bigl( f(\alpha^0), f(\alpha^1), \dots, f(\alpha^{n-k-1}) \bigr) = \begin{pmatrix} B \\ b \end{pmatrix} c . \tag{53} \]

Now, as $\alpha$ is a primitive element of $\mathbb{F}_q$, all $\beta_{\nu_j}$ are distinct from each other and hence $B$ is a Vandermonde matrix, which is invertible. Therefore, the condition (51) implies that $c = B^{-1} \, {}^t\bigl( f(\alpha^0), f(\alpha^1), \dots, f(\alpha^{n-k-2}) \bigr)$ and $f(\alpha^{n-k-1}) = b c$. On the other hand, the latter condition is not satisfied when $f(\alpha^i) = 0$ for every $0 \le i \le n - k - 2$ and $f(\alpha^{n-k-1}) \ne 0$, e.g., $f(x) = \prod_{i=0}^{n-k-2} (x - \alpha^i)$. Hence this element $f(x) \in C$ satisfies $d(f(x), C') \ge n - k$, as desired. This concludes the proof of Proposition 7. □
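Proposition 7 can be confirmed by exhaustive search on a toy instance. The following Python sketch uses the hypothetical parameters $q = 5$, $n = 4$, $k = 2$ with the primitive element $\alpha = 2$ of $\mathbb{F}_5$ (our choice, made so that $\mathbb{F}_q$ is simply the integers modulo $q$); it builds $C'$ as the image of $f \mapsto G f$ and computes $r(C, C')$ by brute force.

```python
from itertools import product

q, n, k = 5, 4, 2   # hypothetical small parameters; q is prime, so F_q = Z/qZ
alpha = 2           # a primitive element of F_5 (its powers are 2, 4, 3, 1)

def poly_mul(a, b):
    """Multiply two coefficient lists (lowest degree first) over F_q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % q
    return out

# G(x) = (x - 1)(x - alpha) ... (x - alpha^(n-k-1)), following (48).
G = [1]
for i in range(n - k):
    G = poly_mul(G, [(-pow(alpha, i, q)) % q, 1])

# C' = image of f -> G * f on polynomials of degree < k, as coefficient vectors in F_q^n.
code = [tuple(poly_mul(G, list(f))) for f in product(range(q), repeat=k)]

def distance_to_code(w):
    return min(sum(a != b for a, b in zip(w, c)) for c in code)

r = max(distance_to_code(w) for w in product(range(q), repeat=n))
print(G, len(set(code)), r, n - k)
```

The witness used in the proof, $f(x) = \prod_{i=0}^{n-k-2} (x - \alpha^i)$, is $x - 1$ in this instance, i.e., the coefficient vector `(4, 1, 0, 0)`.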



6 Concluding remarks 

In this paper, we first specified a class of mathematical problems, which we call Function Density Problems. 
Then we pointed out novel connections of Function Density Problems to theoretical security evaluations of 
keyless hash functions and to constructions of provably secure pseudorandom generators with some enhanced 
security property. Our argument aimed at proposing new theoretical frameworks for these topics (especially 
for the former) based on Function Density Problems, rather than providing some concrete and practical 
results on the topics. We also gave some examples of mathematical discussions on the problems, which 
would be of independent interest from mathematical viewpoints. 

To conclude this paper, we discuss some possible directions for future work. First, there exist some cryptographic protocols whose constructions are motivated by NP-complete/NP-hard problems, but in which the distributions of the problem instances are somewhat biased, so that no one has succeeded in proving the security of the protocols directly from the hardness of the underlying problems (e.g., the McEliece cryptosystem and other code-based protocols relevant to the decoding problem for random linear codes; knapsack cryptosystems relevant to the Subset Sum Problem; etc.). We hope that the idea of Function Density Problems can be applied to measure the closeness of the approximations of the underlying hard problems in those protocols. Secondly, regarding the mathematical characteristics of Function Density Problems, it would be interesting to evaluate their computational difficulty (e.g., to prove, if possible, that Function Density Problems are NP-hard). Moreover, as the examples of Function Density Problems in this paper are for the case that the subset $C'$ of $C$ forms a linear subspace, it would also be significant to study the other cases in which $C'$ is not a linear subspace of $C$.